Regulation of intrinsic noise in gene expression is essential for many cellular functions. Correspondingly,
there is considerable interest in understanding how different molecular mechanisms of
gene expression impact variations in protein levels across a population of cells. In this work, we ana- 
lyze a stochastic model of bursty gene expression which considers general waiting-time distributions 
governing arrival and decay of proteins. By mapping the system to models analyzed in queueing 
theory, we derive analytical expressions for the noise in steady-state protein distributions. The 
derived results extend previous work by including the effects of arbitrary probability distributions 
representing the effects of molecular memory and bursting. The analytical expressions obtained pro- 
vide insight into the role of transcriptional, post-transcriptional and post-translational mechanisms 
in controlling the noise in gene expression. 
Regulation of gene expression is at the core of cellular
adaptation and response to changing environments.
Given that the underlying processes are intrinsically 
stochastic, cellular regulation must be designed to control
variability (noise) in gene expression [1]. While noise
reduction is essential in many cases, regulatory mechanisms
can also exploit the intrinsic stochasticity to increase
noise and generate phenotypic heterogeneity in a
clonal population of cells [2]. Quantifying the contributions
of different sources of intrinsic noise using stochastic
models of gene expression [3-5] is thus an important step
towards understanding cellular processes and variations 
in cell populations. 

Several recent studies have focused on quantifying 
noise in gene expression. Experiments have shown that 
protein production often occurs in 'bursts' [6, 7] and
single-molecule measurements have also provided evidence
for transcriptional bursting, i.e. production of mRNAs
in bursts [8-10]. The analysis and interpretation of
such experimental studies have been aided by the development
of coarse-grained stochastic models of gene expression.
The simplest of these considers the basic processes
(transcription, translation and degradation) as elementary
Poisson processes [11] with exponential waiting-time
distributions. However, since these processes are known 
to involve multiple biochemical steps, the corresponding 
waiting-time distributions can be more general than the 
'memoryless' exponential distribution [12]. An important 
question then arises: how do gene expression mechanisms 
involving molecular memory effects influence the noise in 
protein distributions? 

Motivated by the preceding observations, we introduce
a model including general waiting-time distributions for
processes governing the arrival of bursts and the decay
of proteins (termed 'gestation' and 'senescence' effects,
respectively [12]). The underlying reaction scheme for the
models analyzed in this work is shown in Fig. 1. Production
of mRNAs occurs in independent bursts and the
time interval between the arrival of consecutive mRNA
bursts is characterized by the random variable $T$ with
corresponding probability density function (p.d.f.) $f(t)$. The
number of mRNAs produced in a single transcriptional
burst is characterized by the random variable $m_b$. Each
mRNA independently gives rise to a random number of
proteins (characterized by the random variable $p_b$) before it
is degraded. For the basic models of translation, $p_b$ follows
the geometric distribution [6, 7, 13]. However, more
general schemes of gene expression (e.g. involving
post-transcriptional regulation [14]) can give rise to protein
burst distributions that deviate significantly from a
geometric distribution. Proteins are degraded independently
and the waiting-time distribution for protein decay is
characterized by the p.d.f. $h(t)$.

FIG. 1. Reaction scheme for the underlying gene expression
model. Production of mRNAs occurs in bursts (characterized
by random variable $m_b$ with arbitrary distribution) and
each mRNA gives rise to a burst of proteins (characterized
by random variable $p_b$ with arbitrary distribution) before it
decays (with lifetime $\tau_m$). The waiting-time distributions for
burst arrival and decay of proteins are characterized by the
functions $f(t)$ and $h(t)$, respectively.

In the limit that the mRNA lifetime ($\tau_m$) is much
shorter than the protein lifetime ($\tau_p$), i.e. $\tau_m/\tau_p \ll 1$, the
evolution of cellular protein concentrations can be modeled
by processes governing arrival and decay of proteins
alone [13, 15]. Unless otherwise stated, the analysis in
this paper will focus on this 'burst' limit, in which proteins
are considered to arrive in independent instantaneous
bursts arising from the underlying mRNA burst.
In this limit, we have shown in recent work [16] that the
processes involved in gene expression can be mapped onto
models analyzed in queueing theory. In this mapping,
individual proteins are the analogs of customers in queueing
models. The bursty synthesis of proteins then corresponds
to the arrival of customers in 'batches', whereas
the protein decay-time distribution is the analog of the
service-time distribution for each customer. Given that
degradation of each protein is independent of others in
the system, the process maps onto queueing systems
with infinite servers. Correspondingly, the gene expression
model in Fig. 1 maps onto what is known as a
$GI^X/G/\infty$ system in the queueing literature. In this notation,
the symbol $G$ refers to the general waiting-time
distribution and the superscript $X$ indicates that the customers arrive
in batches of random size $X$, where $X$ is drawn independently
each time from an arbitrary distribution.
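
The mapping described above can be made concrete with a short simulation. The following Python sketch is an illustration under stated assumptions, not the procedure used for the simulations reported in this paper: it treats every protein as a customer in an infinite-server queue, with bursts as batch arrivals and protein lifetimes as service times. The function names and the three sampler arguments are hypothetical, and the steady state is approximated by observing the system after many protein lifetimes.

```python
import random

def simulate_protein_count(sample_gap, sample_burst, sample_lifetime, t_obs, rng):
    """Number of proteins present at time t_obs, starting from an empty cell.

    sample_gap(rng)      -> waiting time between consecutive bursts (p.d.f. f)
    sample_burst(rng)    -> number of proteins in one burst (batch size X)
    sample_lifetime(rng) -> lifetime of a single protein (p.d.f. h)
    """
    count = 0
    t = sample_gap(rng)  # arrival time of the first burst
    while t < t_obs:
        for _ in range(sample_burst(rng)):
            # Infinite servers: each protein decays independently of the others.
            if t + sample_lifetime(rng) > t_obs:
                count += 1
        t += sample_gap(rng)
    return count

def sample_mean_and_variance(values):
    """Sample mean and unbiased sample variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var

# Example (assumed parameters): Poisson burst arrivals with rate 2, one
# protein per burst, exponential lifetimes with rate 1. This is the
# M/M/infinity special case of the general queue.
rng = random.Random(0)
counts = [
    simulate_protein_count(
        lambda r: r.expovariate(2.0),
        lambda r: 1,
        lambda r: r.expovariate(1.0),
        t_obs=20.0,
        rng=rng,
    )
    for _ in range(4000)
]
mean_n, var_n = sample_mean_and_variance(counts)
print(mean_n, var_n)
```

In this special case the steady-state distribution is Poisson with mean 2, so the sample mean and variance should both be close to 2. Replacing any of the three samplers (for example a fixed gap, or a geometric burst size) gives other members of the $GI^X/G/\infty$ family.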

The $GI^X/G/\infty$ system has been analyzed in previous
work in queueing theory [17]. In the following, we
briefly review the notation and relevant results from the
queueing theory analysis. As in Fig. 1, $f(t)$ and $h(t)$
denote the p.d.f. for the interarrival time and service time,
respectively, with $F(t)$ and $H(t)$ as the corresponding
cumulative distribution functions (c.d.f.). The distribution of
batch size $X$ has the corresponding generating function
defined as $A(z) = \sum_i P(X = i) z^i$. The $k$th factorial
moment of batch size $X$, denoted by $A_k$, is given
by $A_k = (d^k A(z)/dz^k)|_{z=1}$. The number of customers
in service at time $t$ is denoted by $N(t)$ and analytical
expressions have been derived for the $r$th binomial
moment $B_r(t)$ of $N(t)$ [17]. These results can be used to derive
expressions for all the moments of $N(t)$, for example
$E[N(t)] = B_1(t)$ and $\mathrm{Var}[N(t)] = 2B_2(t) + B_1(t) - B_1^2(t)$.
In the following, we will focus on two general subcategories
of the $GI^X/G/\infty$ system for which closed-form
analytical expressions can be derived for the mean and 
variance of steady-state protein distributions. These cor- 
respond to two cases: A) arbitrary distributions for ges- 
tation and bursting with a Poisson process governing 
protein degradation and B) arbitrary distributions for 
bursting and senescence with a Poisson process govern- 
ing burst arrival. 

Consider first case A, for which arbitrary gestation
and bursting effects are included. In this case, the random
variable $T$ characterizing the time interval between
bursts is drawn from an arbitrary p.d.f. $f(t)$. The protein
decay-time distribution $h(t)$ is taken to be an exponential
function with $h(t) = \mu_p e^{-\mu_p t}$ and the mean protein
lifetime is given by $\tau_p = 1/\mu_p$. The corresponding queueing
system is $GI^X/M/\infty$, where $M$ indicates that the process
of customer departure, which is the analog of protein decay,
is Markovian. $A(z)$ corresponds to the generating
function of the burst size distribution (determined by random
variables $m_b$ and $p_b$ in Fig. 1) and $N(t)$ denotes the
number of proteins in the cell at time $t$. The previous
analysis [17] has derived expressions for the steady-state
mean and variance corresponding to $N = \lim_{t\to\infty} N(t)$
for the $GI^X/M/\infty$ queue as [18]:

$$E[N] = \frac{A_1}{\mu_p \langle T \rangle}, \qquad \mathrm{Var}[N] = E[N]\left(1 + \frac{f_L(\mu_p)}{1 - f_L(\mu_p)}\,A_1 - E[N] + \frac{A_2}{2A_1}\right), \tag{1}$$

where $\langle T \rangle$ is the mean of the p.d.f. $f(t)$ and $f_L(s)$ is the
Laplace transform of $f(t)$.
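
As a numerical illustration of Eq. (1), the sketch below evaluates the steady-state mean and variance for two simple arrival processes. It assumes that Eq. (1) reads $E[N] = A_1/(\mu_p\langle T\rangle)$ and $\mathrm{Var}[N] = E[N]\,[1 + A_1 f_L(\mu_p)/(1 - f_L(\mu_p)) - E[N] + A_2/(2A_1)]$, with $A_1$ and $A_2$ the first and second factorial moments of the burst size. The function name and argument names are hypothetical.

```python
import math

def gi_x_m_inf_moments(mean_gap, f_laplace_at_mu, mu_p, a1, a2):
    """Steady-state mean and variance of N for the GI^X/M/infinity queue.

    mean_gap        : <T>, mean time between bursts
    f_laplace_at_mu : f_L(mu_p), Laplace transform of f(t) evaluated at mu_p
    mu_p            : protein decay rate
    a1, a2          : first and second factorial moments of the burst size
    The closed form used here is the one assumed in the accompanying text.
    """
    if not 0.0 <= f_laplace_at_mu < 1.0:
        raise ValueError("f_L(mu_p) must lie in [0, 1)")
    mean_n = a1 / (mu_p * mean_gap)
    var_n = mean_n * (
        1.0
        + a1 * f_laplace_at_mu / (1.0 - f_laplace_at_mu)
        - mean_n
        + a2 / (2.0 * a1)
    )
    return mean_n, var_n

mu_p = 1.0

# Poisson arrivals with rate lam and one protein per burst (A1 = 1, A2 = 0):
# f_L(s) = lam / (lam + s), and N should be Poisson (variance equal to mean).
lam = 2.0
mean_n, var_n = gi_x_m_inf_moments(1.0 / lam, lam / (lam + mu_p), mu_p, 1.0, 0.0)

# Fixed gap T_d between single-protein bursts: f_L(s) = exp(-s * T_d).
t_d = 0.5
mean_d, var_d = gi_x_m_inf_moments(t_d, math.exp(-mu_p * t_d), mu_p, 1.0, 0.0)

print(mean_n, var_n)
print(mean_d, var_d)
```

Both cases have the same mean, but regular arrivals give a variance below the mean, which is the kind of gestation effect quantified in the rest of this section.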

To translate the result Eq. (1) into an expression for
the noise in protein distributions, we derive expressions
for $A_1$ and $A_2$ in terms of variables characterizing mRNA
and protein burst distributions. In general, each mRNA
will produce a random number of proteins ($p_b$) and furthermore
the number of mRNAs in the burst is also a
random variable ($m_b$). The number of proteins produced
in a single burst is thus a compound random variable.
Correspondingly, using standard results from probability
theory [19], we derive the following equations for burst
size parameters ($A_1$ and $A_2$) in terms of $m_b$ and $p_b$:

$$A_1 = \langle m_b \rangle \langle p_b \rangle, \qquad A_2 = \langle m_b \rangle \left(\sigma_{p_b}^2 - \langle p_b \rangle\right) + \left(\sigma_{m_b}^2 + \langle m_b \rangle^2\right)\langle p_b \rangle^2, \tag{2}$$

where the symbols $\langle \cdot \rangle$ and $\sigma$ represent the mean and
standard deviation, respectively.
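
The compound-variable relations in Eq. (2) can be checked against a direct calculation. The sketch below assumes Eq. (2) reads $A_1 = \langle m_b\rangle\langle p_b\rangle$ and $A_2 = \langle m_b\rangle(\sigma_{p_b}^2 - \langle p_b\rangle) + (\sigma_{m_b}^2 + \langle m_b\rangle^2)\langle p_b\rangle^2$, and compares it with factorial moments computed from an explicit burst-size distribution. The function names and the example parameters are hypothetical.

```python
def burst_factorial_moments(mean_mb, var_mb, mean_pb, var_pb):
    """A1 and A2 of the total burst size, in the form assumed for Eq. (2)."""
    a1 = mean_mb * mean_pb
    a2 = mean_mb * (var_pb - mean_pb) + (var_mb + mean_mb ** 2) * mean_pb ** 2
    return a1, a2

def factorial_moments_from_pmf(pmf):
    """First and second factorial moments, where pmf[i] = P(X = i)."""
    a1 = sum(i * p for i, p in enumerate(pmf))
    a2 = sum(i * (i - 1) * p for i, p in enumerate(pmf))
    return a1, a2

# Example: exactly one mRNA per burst (mean 1, variance 0) and a geometric
# protein burst with mean b, i.e. P(p_b = i) = (1 - q) q^i for i = 0, 1, 2, ...
# For this distribution the variance is b^2 + b.
b = 3.0
q = b / (1.0 + b)
pmf = [(1.0 - q) * q ** i for i in range(400)]  # tail beyond 400 is negligible

from_eq2 = burst_factorial_moments(1.0, 0.0, b, b * b + b)
direct = factorial_moments_from_pmf(pmf)
print(from_eq2)
print(direct)
```

For this basic model both routes give $A_1 = b$ and $A_2 = 2b^2$.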

Using Eq. (2), in combination with identification of the
random variable $N$ with the corresponding variable
characterizing the number of proteins ($p_s$), we obtain the
following expressions for the mean and the noise (squared
coefficient of variation) of the steady-state protein distribution:

$$\langle p_s \rangle = \frac{\langle m_b \rangle \langle p_b \rangle}{\mu_p \langle T \rangle}, \qquad \frac{\sigma_{p_s}^2}{\langle p_s \rangle^2} = \frac{1}{\langle p_s \rangle} + \frac{\mu_p \langle T \rangle}{2}\left[K_g + \frac{\sigma_{m_b}^2}{\langle m_b \rangle^2} + \frac{1}{\langle m_b \rangle}\left(\frac{\sigma_{p_b}^2}{\langle p_b \rangle^2} - \frac{1}{\langle p_b \rangle}\right)\right], \tag{3}$$

where

$$K_g = \frac{2 f_L(\mu_p)}{1 - f_L(\mu_p)} - \frac{2}{\mu_p \langle T \rangle} + 1 \tag{4}$$

is denoted as the gestation factor.

Different contributions to the noise in protein distributions
are highlighted in Eq. (3): gestation effects,
mRNA transcriptional bursting, and translational bursting
from a single mRNA, which correspond to the terms
$K_g$, $\sigma_{m_b}^2/\langle m_b \rangle^2$ and $\sigma_{p_b}^2/\langle p_b \rangle^2$, respectively. The first two
terms can be modified by transcriptional regulation and
the last term can be tuned by post-transcriptional regulation.
It is noteworthy that each source contributes
additively to the overall noise in the steady-state distribution.
Moreover, while the noise due to gestation effects
is independent of the degree of transcriptional bursting,
the noise contribution from translational bursting is
effectively reduced by transcriptional bursting.

While Eq. (3) is valid for general gestation effects, it is
of interest to consider specific examples. We consider the
case such that there is a constant delay between arrival of
consecutive mRNA bursts, i.e. the waiting-time distribution
is $f(t) = \delta(t - T_d)$. In this case, the gestation factor
is given by $K_g = 2e^{-\mu_p T_d}/(1 - e^{-\mu_p T_d}) - 2/(\mu_p T_d) + 1$. The
corresponding expression for the noise in protein distributions,
Eq. (3), considering a general case which also includes
the effects of post-transcriptional regulation [14],
is in excellent agreement with results from stochastic
simulations (Fig. 2A). It is noteworthy that $K_g$ can be
non-vanishing even though the time interval between consecutive
bursts is fixed (i.e. $\sigma_T = 0$). In contrast to previous
work [12], which suggests that the contribution of gestation
effects to the noise vanishes when $\sigma_T = 0$, our result
shows that $K_g$ can be tuned from 0 to 1 as $\mu_p T_d$ is varied.

FIG. 2. The noise vs $\mu_p \langle T \rangle$ from analytical expressions and
stochastic simulations. A) The time interval between consecutive
bursts is fixed and only 1 mRNA is produced in each burst.
The protein production is under post-transcriptional regulation
[14] such that $\sigma_{p_b}^2 = 0.67\langle p_b \rangle^2 + \langle p_b \rangle$ and $\tau_m/\tau_p = 0.02$.
B) The time interval between bursts is drawn from a Gamma
distribution and the number of mRNAs created in one burst
is drawn from a Poisson distribution. The number of proteins
created by each mRNA follows a geometric distribution. The
parameters are $\tau_m/\tau_p = 0.2$, $\langle m_b \rangle = 10$, $\sigma_{m_b}^2/\langle m_b \rangle^2 = 0.1$
and $\sigma_T^2/\langle T \rangle^2 = 0.2$. While Eq. (5) agrees with simulations,
the result from Ref. [12] is less accurate when $\mu_p \langle T \rangle$ is large.

While the results derived above are valid in the limit
$\tau_m \ll \tau_p$, an exact expression for the noise in the general
case (i.e. without invoking the condition $\tau_m \ll \tau_p$ and for
general gestation and bursting distributions) is difficult
to obtain. However, a useful approximation can be obtained
by noting that, for the basic gene expression models,
the exact result is obtained by scaling the terms in
the bracket in Eq. (3) with a time-averaging factor
$\tau_p/(\tau_p + \tau_m)$ [20]. Using the approximation that the
time-averaging factor is the same for general gestation and
bursting distributions, we obtain

$$\frac{\sigma_{p_s}^2}{\langle p_s \rangle^2} \approx \frac{1}{\langle p_s \rangle} + \frac{\mu_p \langle T \rangle}{2}\,\frac{\tau_p}{\tau_p + \tau_m}\left[K_g + \frac{\sigma_{m_b}^2}{\langle m_b \rangle^2} + \frac{1}{\langle m_b \rangle}\left(\frac{\sigma_{p_b}^2}{\langle p_b \rangle^2} - \frac{1}{\langle p_b \rangle}\right)\right]. \tag{5}$$
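
The gestation factor and the resulting noise are straightforward to evaluate numerically. The sketch below is illustrative only and rests on two assumptions: that the gestation factor has the form $K_g = 2f_L(\mu_p)/(1 - f_L(\mu_p)) - 2/(\mu_p\langle T\rangle) + 1$, which reduces to the constant-delay expression quoted in the text, and that the noise is $1/\langle p_s\rangle$ plus $\mu_p\langle T\rangle/2$ times a bracket containing $K_g$ and the bursting terms, scaled by the time-averaging factor $\tau_p/(\tau_p + \tau_m)$. All function and parameter names are hypothetical.

```python
import math

def gestation_factor(f_laplace_at_mu, mu_p, mean_gap):
    """Assumed form: K_g = 2 f_L / (1 - f_L) - 2 / (mu_p <T>) + 1."""
    return (2.0 * f_laplace_at_mu / (1.0 - f_laplace_at_mu)
            - 2.0 / (mu_p * mean_gap) + 1.0)

def laplace_fixed_delay(mu_p, t_d):
    """Laplace transform of f(t) = delta(t - t_d), evaluated at mu_p."""
    return math.exp(-mu_p * t_d)

def laplace_gamma(mu_p, mean_gap, cv2):
    """Laplace transform of a Gamma p.d.f. with given mean and squared CV."""
    shape = 1.0 / cv2
    scale = mean_gap * cv2
    return (1.0 + scale * mu_p) ** (-shape)

def protein_noise(mu_p, mean_gap, k_g, mean_mb, var_mb, mean_pb, var_pb, tau_m=0.0):
    """Squared CV of the protein number in the assumed form.

    tau_m = 0 corresponds to the burst limit; tau_m > 0 applies the
    time-averaging factor tau_p / (tau_p + tau_m) to the bracket.
    """
    tau_p = 1.0 / mu_p
    mean_ps = mean_mb * mean_pb / (mu_p * mean_gap)
    bracket = (k_g + var_mb / mean_mb ** 2
               + (var_pb / mean_pb ** 2 - 1.0 / mean_pb) / mean_mb)
    averaging = tau_p / (tau_p + tau_m)
    return 1.0 / mean_ps + 0.5 * mu_p * mean_gap * averaging * bracket

mu_p = 1.0
for x in (0.01, 1.0, 50.0):  # x plays the role of mu_p * <T>
    k_fixed = gestation_factor(laplace_fixed_delay(mu_p, x), mu_p, x)
    k_gamma = gestation_factor(laplace_gamma(mu_p, x, 0.2), mu_p, x)
    print(x, k_fixed, k_gamma)
```

For a fixed delay the factor rises from near 0 at small $\mu_p T_d$ towards 1 at large $\mu_p T_d$, while exponential waiting times (a Gamma distribution with squared CV equal to 1) give exactly 1, in which case the basic bursty model is recovered.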



It is instructive to compare Eq. (5) with the result
derived in previous work [12], which assumes the basic
protein production reaction scheme such that
$\sigma_{p_b}^2 = \langle p_b \rangle^2 + \langle p_b \rangle$. Considering this specific case, we note that
Eq. (5) is identical to the previous result [12] apart from
the terms corresponding to the gestation factor $K_g$. The
connection to the previous result can be seen by expanding
the Laplace transform, $f_L(\mu_p)$, in terms of moments
of $T$. By assuming $\mu_p \langle T \rangle$ is small and $\langle T^n \rangle$ scales as
the $n$th power of $\langle T \rangle$ or less, $K_g$ can be approximated by
$K_g \approx \sigma_T^2/\langle T \rangle^2$, which corresponds to the previous result.
Since the parameter $1/(\mu_p \langle T \rangle)$ measures the mean number
of bursts occurring during the protein lifetime, this
indicates that the previous result [12] is valid for the case
of frequent bursting during a protein lifetime, and breaks
down when bursts occur over larger time intervals (Fig.
2B).
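
The small-$\mu_p\langle T\rangle$ limit quoted above can be written out explicitly. The following sketch assumes that the gestation factor has the form $K_g = 2f_L(\mu_p)/(1 - f_L(\mu_p)) - 2/(\mu_p\langle T\rangle) + 1$, which reduces to the constant-delay expression given earlier, and that $\langle T^n \rangle$ scales as $\langle T \rangle^n$ or less so that higher-order terms are controlled by $\mu_p\langle T\rangle$.

```latex
\begin{align}
f_L(\mu_p) &= 1 - \mu_p\langle T\rangle + \tfrac{1}{2}\mu_p^2\langle T^2\rangle
              + O\!\left(\mu_p^3\langle T^3\rangle\right), \\
\frac{2}{1 - f_L(\mu_p)} &= \frac{2}{\mu_p\langle T\rangle}
              + \frac{\langle T^2\rangle}{\langle T\rangle^2}
              + O\!\left(\mu_p\langle T\rangle\right), \\
K_g &= \frac{2}{1 - f_L(\mu_p)} - 2 - \frac{2}{\mu_p\langle T\rangle} + 1
     = \frac{\langle T^2\rangle}{\langle T\rangle^2} - 1 + O\!\left(\mu_p\langle T\rangle\right)
     = \frac{\sigma_T^2}{\langle T\rangle^2} + O\!\left(\mu_p\langle T\rangle\right).
\end{align}
```

The first line is the moment expansion of the Laplace transform, the second inverts $1 - f_L(\mu_p)$ to leading orders, and the third uses $2f_L/(1 - f_L) = 2/(1 - f_L) - 2$. The divergent $2/(\mu_p\langle T\rangle)$ terms cancel, leaving the squared coefficient of variation of $T$ as the leading contribution.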

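We now consider case B, which corresponds to arbitrary
distributions for bursting and senescence effects
along with exponential waiting-time distributions for
burst arrival. For this case, we take the waiting time for
protein degradation to be drawn from an arbitrary distribution
characterized by p.d.f. $h(t)$ and c.d.f. $H(t)$. The
waiting time between consecutive bursts is characterized
by an exponential distribution with $f(t) = \lambda e^{-\lambda t}$. The
corresponding system, following the mapping to queueing
theory, is the $M^X/G/\infty$ queue. The steady-state mean
and variance of $N$ for this queue have been obtained in
previous work [17]:

$$E[N] = \lambda A_1 \int_0^\infty [1 - H(t)]\,dt, \qquad \mathrm{Var}[N] = E[N] + \lambda A_2 \int_0^\infty [1 - H(t)]^2\,dt. \tag{6}$$

By taking Eq. (2) and the relation $\langle T \rangle = 1/\lambda$ into account,
the mean and the noise for arbitrary senescence
and bursting distributions can be derived as:

$$\langle p_s \rangle = \frac{A_1}{\langle T \rangle}\int_0^\infty [1 - H(t)]\,dt = \frac{\tau_p}{\langle T \rangle}\langle m_b \rangle \langle p_b \rangle, \qquad \frac{\sigma_{p_s}^2}{\langle p_s \rangle^2} = \frac{1}{\langle p_s \rangle} + K_s\,\frac{\langle T \rangle}{2\tau_p}\left[1 + \frac{\sigma_{m_b}^2}{\langle m_b \rangle^2} + \frac{1}{\langle m_b \rangle}\left(\frac{\sigma_{p_b}^2}{\langle p_b \rangle^2} - \frac{1}{\langle p_b \rangle}\right)\right], \tag{7}$$

where $\tau_p = \int_0^\infty [1 - H(t)]\,dt$ is the mean protein lifetime and

$$K_s = \frac{2\int_0^\infty [1 - H(t)]^2\,dt}{\int_0^\infty [1 - H(t)]\,dt} \tag{8}$$

is denoted as the senescence factor.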

It is noteworthy that Eq. (7) and Eq. (3) have multiple terms
in common. The terms characterizing the noise from
transcriptional and translational bursting remain unchanged.
However, unlike the gestation factor that contributes
to the total noise additively, the senescence factor
serves as a scaling factor for the total noise. While
there is no obvious upper limit on the value of $K_g$, the
upper bound for $K_s$ is 2, as is evident from Eq. (8). In general,
as the distribution $h(t)$ grows more sharply peaked,
the $K_s$ value increases. When $h(t)$ becomes a delta function,
$K_s$ reaches its maximum value.
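
The dependence of the senescence factor on the shape of $h(t)$ can be illustrated numerically. The sketch below assumes the form $K_s = 2\int_0^\infty [1 - H(t)]^2 dt / \int_0^\infty [1 - H(t)]\,dt$, chosen to be consistent with Eq. (6) and with the stated maximum of 2, and evaluates it by a simple midpoint rule for three lifetime distributions with unit mean. The function names, the integration cutoff and the step count are hypothetical choices.

```python
import math

def senescence_factor(survival, t_max=40.0, steps=100000):
    """K_s = 2 * int (1 - H)^2 dt / int (1 - H) dt (assumed form).

    survival(t) returns 1 - H(t); both integrals use the midpoint rule on
    [0, t_max], so t_max must be large compared with the protein lifetime.
    """
    dt = t_max / steps
    first = 0.0
    second = 0.0
    for i in range(steps):
        s = survival((i + 0.5) * dt)
        first += s * dt
        second += s * s * dt
    return 2.0 * second / first

def exponential_survival(t):
    """Exponential lifetime with mean 1 (memoryless decay)."""
    return math.exp(-t)

def erlang2_survival(t):
    """Two-step (Erlang, shape 2) lifetime with mean 1."""
    return (1.0 + 2.0 * t) * math.exp(-2.0 * t)

def fixed_survival(t):
    """Every protein lives for exactly one time unit (delta-function h)."""
    return 1.0 if t < 1.0 else 0.0

k_exp = senescence_factor(exponential_survival)
k_erlang = senescence_factor(erlang2_survival)
k_fixed = senescence_factor(fixed_survival)
print(k_exp, k_erlang, k_fixed)
```

The three values increase from about 1 (exponential) through 1.25 (two-step decay) to 2 (fixed lifetime), in line with the statement that more sharply peaked lifetime distributions give larger $K_s$.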

The general results derived in this work will serve as 
useful inputs for the analysis and interpretation of diverse 
experimental studies of gene expression. Some examples 
are: 1) Recent experiments on single-cell studies of HIV-1 
viral infections have focused on the frequency and degree 
of transcriptional bursting [21]. For such studies, the derived
results can be used to relate measurements of inter-arrival
waiting-time distributions and burst distributions
to the noise in protein distributions. 2) Experimental 
data and computational models of the cell-cycle in yeast 
indicate that modeling the basic processes of gene ex- 
pression as Poisson processes gives rise to unrealistically 
large noise in protein distributions [22], thereby suggesting
that regulatory schemes which change distributions
to reduce the noise are employed by the cell. The analyt- 
ical expressions derived highlight different contributions 
to noise and can thus provide insight into how different 
regulatory schemes can lead to noise reduction. 3) More 
generally, the results derived can be used in the analysis 
of inverse problems, i.e. using experimental measure- 
ments of intrinsic noise to determine parameters of the 
underlying kinetic models. Such efforts, in turn, can lead 
to further insights into cellular factors that impact gene 
regulation, based on experimental observations of noise
in gene expression. 

In summary, we have analyzed the noise in protein 
distributions for general stochastic models of gene ex- 
pression. The present work extends previous analysis by 
deriving analytical results for the noise in protein distri- 
butions for arbitrary gestation, senescence and bursting 
mechanisms. The expressions obtained provide insight 
into how different sources contribute to the noise in pro- 
tein levels which can lead to phenotypic heterogeneity in 
isogenic populations. The results derived will thus serve 
as useful inputs for the analysis and interpretation of 
experiments probing stochastic gene expression and its 
phenotypic consequences. At a broader level, this work 
demonstrates the benefits of developing a mapping between
models of stochastic gene expression and queueing
systems, which has potential applications for research
in both fields. The extensive analytical approaches and 
tools developed in queueing theory can now be employed 
to analyze stochastic processes in gene expression. It is 
also anticipated that future analysis of regulatory mech- 
anisms for gene expression will lead to new problems and 
challenges for queueing theory. 